Adversarial Patch Generation for Automatic Program Repair
Automatic program repair (APR) has seen a growing interest in recent years
with numerous techniques proposed. One notable line of research work in APR is
search-based techniques which generate repair candidates via syntactic analyses
and search for valid repairs in the generated search space. In this work, we
explore an alternative approach which is inspired by the adversarial notion of
bugs and repairs. Our approach leverages the deep learning Generative
Adversarial Networks (GANs) architecture to suggest repairs that are as close
as possible to human-generated repairs. Preliminary evaluations demonstrate
promising results: our approach generates repairs identical to the human fixes
for 21.2% of 500 bugs.
Comment: Submitted to IEEE Software's special issue on Automatic Program
Repair. Added reference
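The adversarial framing above can be illustrated with a toy sketch: a "generator" proposes candidate patches and a "discriminator" scores how close each candidate is to a human-written fix. This is not the paper's GAN model; the buggy snippet, the operator-swap generator, and the character-overlap score are all illustrative assumptions.

```python
import random

BUGGY = "if x > 0:"          # hypothetical buggy guard
HUMAN_FIX = "if x >= 0:"     # hypothetical human-written repair

def generator(buggy, rng):
    """Propose a patch by swapping the comparison operator (toy move set)."""
    ops = [">", ">=", "<", "<=", "==", "!="]
    return buggy.replace(">", rng.choice(ops), 1)

def discriminator(candidate, reference):
    """Score a candidate: fraction of matching characters, a toy proxy
    for 'indistinguishable from a human fix'."""
    matches = sum(a == b for a, b in zip(candidate, reference))
    return matches / max(len(candidate), len(reference))

def search_repair(buggy, reference, rounds=200, seed=0):
    """Keep the candidate the discriminator scores highest."""
    rng = random.Random(seed)
    best, best_score = buggy, discriminator(buggy, reference)
    for _ in range(rounds):
        cand = generator(buggy, rng)
        score = discriminator(cand, reference)
        if score > best_score:
            best, best_score = cand, score
    return best

print(search_repair(BUGGY, HUMAN_FIX))  # converges to "if x >= 0:"
```

In the actual approach the discriminator is trained, not given a reference fix; the toy only shows the adversarial division of labor.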
VulCurator: A Vulnerability-Fixing Commit Detector
The open-source software (OSS) vulnerability management process is increasingly
important, as the number of discovered OSS vulnerabilities grows over
time. Monitoring vulnerability-fixing commits is a part of the standard process
to prevent vulnerability exploitation. Manually detecting vulnerability-fixing
commits is, however, time consuming due to the possibly large number of commits
to review. Recently, many techniques have been proposed to automatically detect
vulnerability-fixing commits using machine learning. These solutions either (1)
do not use deep learning, or (2) apply deep learning to only limited sources
of information. This paper proposes VulCurator, a tool that leverages deep
learning on richer sources of information, including commit messages, code
changes, and issue reports, for vulnerability-fixing commit classification. Our
experimental results show that VulCurator outperforms the state-of-the-art
baselines by up to 16.1% in terms of F1-score. VulCurator is publicly
available at https://github.com/ntgiang71096/VFDetector and
https://zenodo.org/record/7034132#.Yw3MN-xBzDI, with a demo video at
https://youtu.be/uMlFmWSJYOE.
Comment: accepted to ESEC/FSE 2022, Tool Demos Track
Adversarial Attacks on Code Models with Discriminative Graph Patterns
Pre-trained language models of code are now widely used in various software
engineering tasks such as code generation, code completion, vulnerability
detection, etc. This, in turn, poses security and reliability risks to these
models. One of the important threats is \textit{adversarial attacks}, which can
lead to erroneous predictions and largely affect model performance on
downstream tasks. Current adversarial attacks on code models usually adopt
fixed sets of program transformations, such as variable renaming and dead code
insertion, leading to limited attack effectiveness. To address the
aforementioned challenges, we propose a novel adversarial attack framework,
GraphCodeAttack, to better evaluate the robustness of code models. Given a
target code model, GraphCodeAttack automatically mines important code patterns,
which can influence the model's decisions, to perturb the structure of input
code to the model. To do so, GraphCodeAttack uses a set of input source code
samples to probe the model's outputs and identifies \textit{discriminative} AST
patterns that can influence the model's decisions. GraphCodeAttack then selects
appropriate AST patterns, concretizes the selected patterns as attacks, and
inserts them as dead code into the model's input program. To effectively
synthesize attacks from AST patterns, GraphCodeAttack uses a separate
pre-trained code model to fill in the ASTs with concrete code snippets. We
evaluate the robustness of two popular code models, CodeBERT and
GraphCodeBERT, against our proposed approach on three tasks: Authorship
Attribution, Vulnerability Prediction, and Clone Detection. The experimental
results suggest that our approach significantly outperforms
state-of-the-art attack approaches such as CARROT and ALERT.
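The dead-code-insertion step can be sketched with Python's own `ast` module: parse the input program, prepend an unreachable statement to each function body, and unparse. The inserted snippet here is a hypothetical stand-in for a concretized pattern; GraphCodeAttack mines its patterns from the target model rather than hard-coding them, and attacks arbitrary languages rather than just Python.

```python
import ast

SOURCE = """
def check(password):
    return password == "secret"
"""

# A hypothetical concretized "pattern": an unreachable statement that
# changes the code's surface structure without changing its behavior.
DEAD_CODE = "if False:\n    unused_flag = 0"

def insert_dead_code(source, snippet):
    """Prepend a dead-code snippet to every function body in `source`."""
    tree = ast.parse(source)
    dead = ast.parse(snippet).body
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            node.body = dead + node.body
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+

perturbed = insert_dead_code(SOURCE, DEAD_CODE)
print(perturbed)
```

Because the inserted branch never executes, the perturbed program is semantically equivalent to the original, which is what makes such transformations valid adversarial perturbations.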
Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues
In this paper, we systematically study the quality of 4,066 ChatGPT-generated
code implemented in two popular programming languages, i.e., Java and Python,
for 2,033 programming tasks. The goal of this work is threefold. First, we
analyze the correctness of ChatGPT on code generation tasks and uncover the
factors that influence its effectiveness, including task difficulty,
programming language, the time at which a task was introduced, and program
size. Second,
we identify and characterize potential issues with the quality of
ChatGPT-generated code. Last, we provide insights into how these issues can be
mitigated. Experiments highlight that out of 4,066 programs generated by
ChatGPT, 2,757 programs are deemed correct, 1,081 programs provide wrong
outputs, and 177 programs contain compilation or runtime errors. Additionally,
we further analyze other characteristics of the generated code through static
analysis tools, such as code style and maintainability, and find that 1,933
ChatGPT-generated code snippets suffer from maintainability issues.
Subsequently, we investigate ChatGPT's self-debugging ability and its
interaction with static analysis tools to fix the errors uncovered in the
previous step. Experiments suggest that ChatGPT can partially address these
challenges, improving code quality by more than 20%, but there are still
limitations and opportunities for improvement. Overall, our study provides
valuable insights into the current limitations of ChatGPT and offers a roadmap
for future research and development efforts to enhance the code generation
capabilities of AI models like ChatGPT.
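The reported counts translate into the following shares of the 4,066 generated programs (figures taken from the abstract above; only the percentages are computed here):

```python
# Back-of-the-envelope breakdown of the reported numbers.
total = 4066
correct = 2757          # deemed correct
wrong_output = 1081     # wrong outputs
errors = 177            # compilation or runtime errors
maintainability = 1933  # snippets with maintainability issues

def pct(n, d=total):
    """Percentage of the 4,066 generated programs, one decimal place."""
    return round(100.0 * n / d, 1)

print(f"correct:           {pct(correct)}%")          # 67.8%
print(f"wrong output:      {pct(wrong_output)}%")     # 26.6%
print(f"compile/runtime:   {pct(errors)}%")           # 4.4%
print(f"maintainability:   {pct(maintainability)}%")  # 47.5%
```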
CHRONOS: Time-Aware Zero-Shot Identification of Libraries from Vulnerability Reports
Tools that alert developers about library vulnerabilities depend on accurate,
up-to-date vulnerability databases which are maintained by security
researchers. These databases record the libraries related to each
vulnerability. However, the vulnerability reports may not explicitly list every
library and human analysis is required to determine all the relevant libraries.
Human analysis may be slow and expensive, which motivates the need for
automated approaches. Researchers and practitioners have proposed to
automatically identify libraries from vulnerability reports using extreme
multi-label learning (XML).
While state-of-the-art XML techniques show promising performance, their
experimental settings do not reflect how the techniques are used in practice. Previous
studies randomly split the vulnerability reports data for training and testing
their models without considering the chronological order of the reports. This
may unduly train the models on chronologically newer reports while testing the
models on chronologically older ones. However, in practice, one often receives
chronologically new reports, which may be related to previously unseen
libraries. Under this practical setting, we observe that the performance of
current XML techniques declines substantially: for example, F1 drops from 0.7
when the chronological order of vulnerability reports is ignored to 0.24 when
it is respected.
We propose a practical library identification approach, namely CHRONOS, based
on zero-shot learning. The novelty of CHRONOS is three-fold. First, CHRONOS
fits into the practical pipeline by considering the chronological order of
vulnerability reports. Second, CHRONOS enriches the data of the vulnerability
descriptions and labels using a carefully designed data enhancement step.
Third, CHRONOS exploits the temporal ordering of the vulnerability reports
using a cache to prioritize prediction of...
Comment: Accepted to the Technical Track of ICSE 202
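The chronological split that CHRONOS argues for can be sketched in a few lines: train only on reports strictly older than every test report, instead of shuffling. The report IDs and dates below are hypothetical.

```python
from datetime import date

# Hypothetical vulnerability reports with publication dates.
reports = [
    ("CVE-A", date(2019, 3, 1)),
    ("CVE-B", date(2020, 7, 15)),
    ("CVE-C", date(2021, 1, 9)),
    ("CVE-D", date(2021, 11, 30)),
    ("CVE-E", date(2022, 5, 2)),
]

def chronological_split(reports, train_ratio=0.6):
    """Sort by date and cut, so every training report predates
    every test report (no temporal leakage)."""
    ordered = sorted(reports, key=lambda r: r[1])
    cut = int(len(ordered) * train_ratio)
    return ordered[:cut], ordered[cut:]

train, test = chronological_split(reports)
assert max(d for _, d in train) < min(d for _, d in test)
print([r for r, _ in train], [r for r, _ in test])
```

A random split would let the model peek at chronologically newer reports during training, which is exactly the leakage behind the inflated F1 figures noted above.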